                          PERFMON
                      profiling tool
                             by
                     GUENTHER STRASSER
                     61814088 AT VIEVMA


HOW TO USE PERFMON
==================

PACKAGE

The package consists of the following parts:

1. PERFPRS.EXE

This is a filter program which adds profiling hooks to all C functions.
It must be run after the preprocessor step and before the compilation step.

2. PERFMON.EXE

This program reads and analyzes the run-time output produced by a program
whose sources were processed with PERFPRS.EXE.

3. PERFTSTM.DLL

This module contains the functions which measure time, maintain a call index
and produce the output files used for the analysis.

4. PERFTSTM.LIB

Each object which was compiled with PERFPRS.EXE must be linked with this
library.

5. CL(X).EXE

To ease the use of the source filter, this program can be used in place
of the MSC 6.0 CL.EXE. It has its problems, however, so the source is
supplied and may be adapted to special needs.

COMPILATION

Unpack the PERFMN.ZIP file with PKUNZIP. You get the files listed above.

Recompile every source file you are interested in, using the following form:

CL <MSC flags> /EP MYFILE.C | PERFPRS > TMP.C
CL <MSC flags> /FoMYFILE.OBJ TMP.C

All flags which are normally used must be applied to the preprocessing
step, too, because they may influence the way the preprocessor translates
C macros and variables. The /EP switch tells MSC to perform the preprocessor
step only and to write the result to standard output. The /Fo switch tells
MSC how to name the object file. CLX.EXE can be used to do both steps for you.
You may copy it to the directory where you compile and rename it to CL.EXE.
In this case you do not need to change your makefiles. CLX searches for the
original CL.EXE and uses it to perform the compilation.
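
The two steps can be sketched as a small helper which builds both command
lines for one source file. This is only an illustration and not the supplied
CLX source: the function name BuildCommands and the buffer sizes are
assumptions, and it uses snprintf from standard C rather than anything
specific to MSC 6.0. It does not search for the original CL.EXE.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the preprocess+filter command (step1) and the
   compile command (step2) for one source file.
   Returns 0 on success, -1 if a buffer is too small. */
int BuildCommands(const char *flags, const char *source,
                  char *step1, char *step2, size_t size)
{
    char base[64];                        /* source name without extension */
    const char *dot = strrchr(source, '.');
    size_t len = dot ? (size_t)(dot - source) : strlen(source);
    int n;

    if (len >= sizeof base)
        return -1;
    memcpy(base, source, len);
    base[len] = '\0';

    /* Step 1: preprocess only (/EP) and pipe the result through the filter. */
    n = snprintf(step1, size, "CL %s /EP %s | PERFPRS > TMP.C", flags, source);
    if (n < 0 || (size_t)n >= size)
        return -1;

    /* Step 2: compile the filtered file, naming the object after the source. */
    n = snprintf(step2, size, "CL %s /Fo%s.OBJ TMP.C", flags, base);
    if (n < 0 || (size_t)n >= size)
        return -1;

    return 0;
}
```

Note that the same flags string goes into both commands, which is the point
made above about the preprocessing step.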

PERFPRS.EXE adds a call to a profiling function at each entry to and each
exit from every function. Thus, whenever the function is called, a profiling
record can be produced, even if the function is called indirectly or from
a place which was not compiled using PERFPRS. The profiling functions are
placed in PERFTSTM.DLL, which contains many other functions as well. That is
why the DLL is so large; the profiling functions themselves are rather small.
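
To give an idea of what such instrumentation looks like, here is a minimal
sketch. The hook names PerfEnter and PerfExit are hypothetical (the real entry
points of PERFTSTM.DLL are not listed in this file), and the stubs below only
count and print records instead of writing them to a file.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the profiling functions in PERFTSTM.DLL. */
static int perf_depth = 0;     /* current call nesting depth   */
static int perf_entries = 0;   /* number of entry records seen */
static int perf_exits = 0;     /* number of exit records seen  */

void PerfEnter(const char *name)
{
    perf_entries++;
    printf("%*senter %s\n", perf_depth * 2, "", name);
    perf_depth++;
}

void PerfExit(const char *name)
{
    perf_depth--;
    perf_exits++;
    printf("%*sexit  %s\n", perf_depth * 2, "", name);
}

/* Original source:  int Square(int x) { return x * x; }
   A filter of this kind might emit something like the following. */
int Square(int x)
{
    int result;
    PerfEnter("Square");
    result = x * x;
    PerfExit("Square");        /* hook before the only return */
    return result;
}

int SumOfSquares(int a, int b)
{
    int result;
    PerfEnter("SumOfSquares");
    result = Square(a) + Square(b);
    PerfExit("SumOfSquares");
    return result;
}
```

Because the hooks sit inside the called function, a record is produced no
matter who the caller is, which matches the behaviour described above.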

LINKAGE

The only modification needed to your build process is that you must link
PERFTSTM.LIB to your EXE (or DLL) and that PERFTSTM.DLL must be reachable
via LIBPATH.

RUN-TIME

When you run your program, PERFMON produces two files in the current directory:
PERFMON.DBX and PERFMON.NMS. The first contains the profiling records, the second
contains an index of the functions used. PERFMON.DBX may grow during the execution
of your application. OS/2 writes and closes the files only if your application
completes successfully. Otherwise the files may be corrupted.

ANALYSIS

After the application has finished, use PERFMON to analyze the files. PERFMON
loads them from the current directory, so you are not prompted for
a location. The result of the analysis is written to standard output
and should be redirected to a file or a printer queue. The following lines are
produced for each called function:

   # Calls      Self       Total    Self/Call  Total/Call       function name

======================================================================================
[0002] EdtEdit
======================================================================================
Parents:
          1      10373      11282      10373      11282   EdtInit [0079]
          1          0      17094          0      17094   EvDoubleClick [0069]
--------------------------------------------------------------------------------------
          2      10373      28376       5186      14188   EdtEdit
--------------------------------------------------------------------------------------
Children:
          4          0          0          0          0   StwInhibitClose [0049]
          2          0          0          0          0   SymGetPointer [0046]
          5         32        282          6         56   MemAllocDebug [0008]
          2         31         31         15         15   MetGetHwnd [0038]
         35         32         32          0          0   EvWinProc [0021]
         32         31         31          0          0   EvFrameProc [0020]
         16         63         63          3          3   AthWindowProc [0003]
         20         94         94          4          4   StwWinProc [0019]
          6          0         64          0         10   MemFreeDebug [0014]
          1          0         93          0         93   SymNewSym [0061]
          1          0        157          0        157   EdtCreateSym [0029]
          1          0          0          0          0   StwPutDiagModify [0055]
          1          0        125          0        125   SymPushUndo [0065]
          1      16247      17031      16247      17031   AthSendSyncMsg [0001]

Each block starts with an ascending index number, which simplifies
cross-references, and the name of the function. Then all functions which call
the function (its parents) are listed. The numbers in the parent section are:
   - number of calls from this parent
   - total time spent in the function itself due to calls from this parent
   - total time spent in the function and all its descendants due to calls from
     this parent
   - average time spent in the function itself per call from this parent
   - average time spent in the function and all its descendants per call from
     this parent
   - name and index of the calling function

Next comes the timing data for the function itself:
   - number of calls to the function
   - total time spent in this function
   - total time spent in this function and all its descendants
   - average time spent in this function per call
   - average time spent in this function and all its descendants per call
   - name of the analyzed function

Next PERFMON prints a list of the functions which are (directly or indirectly)
called by the function. The timing data must be interpreted like this:
   - number of calls to the child from the function
   - total time spent in the child function due to calls from the function
   - total time spent in the child and all its descendants due to calls from
     this function
   - average time spent in the child per call from the function
   - average time spent in the child and all its descendants per call
     from the function
   - name and index of the called child function
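
In the EdtEdit sample, the function's own line equals the column-wise sum of
its parent lines, and the per-call columns match a truncating integer
division of the totals by the number of calls. The sketch below reproduces
that arithmetic. It is an assumption that PERFMON computes the values in
this way; the struct and function names are illustrative only.

```c
#include <stdio.h>

/* One line of the report; field names are illustrative. */
struct Row {
    long calls;   /* "# Calls"           */
    long self;    /* "Self"  column, ms  */
    long total;   /* "Total" column, ms  */
};

/* Truncating integer average; 0 when there were no calls. */
long PerCall(long time, long calls)
{
    return calls > 0 ? time / calls : 0;
}

/* Add up several rows column by column (e.g. all parent rows). */
struct Row SumRows(const struct Row *rows, int count)
{
    struct Row sum = {0, 0, 0};
    int i;

    for (i = 0; i < count; i++) {
        sum.calls += rows[i].calls;
        sum.self  += rows[i].self;
        sum.total += rows[i].total;
    }
    return sum;
}

/* Print a row in a layout resembling the report. */
void PrintRow(const struct Row *r, const char *name)
{
    printf("%11ld%11ld%11ld%11ld%11ld   %s\n",
           r->calls, r->self, r->total,
           PerCall(r->self, r->calls), PerCall(r->total, r->calls), name);
}
```

With the two parent rows of EdtEdit (1, 10373, 11282 and 1, 0, 17094) SumRows
gives 2, 10373 and 28376, and PerCall gives 5186 and 14188, as in the sample.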

PERFMON uses the names as they appear in the source code, so function names
carry no leading "_". All times are in milliseconds.

IMPORTANT NOTE:
Because I have not figured out how to retrieve the time consumed by a given
thread, all the calculated time values are nearly meaningless in multithreaded
programs and rather meaningless in single-threaded programs. Moreover, the
resolution of the OS/2 timer is 32 milliseconds, which is far too coarse for
this purpose. Many functions consume less time than that. My tests showed that
such a function appears to consume no time at all, while the same function
suddenly consumes a multiple of 32 ms if it executes across the boundary of a
time slice.
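
The effect can be reproduced with a small simulation. This is not OS/2 code
and makes no system calls; it simply assumes a clock which only advances in
steps of 32 ms, as stated above. The names CoarseClock and MeasuredMs are
made up for this sketch.

```c
#define TICK_MS 32L   /* timer resolution assumed from the text above */

/* What a coarse clock would report at the true time true_ms:
   the true time rounded down to the last tick. */
long CoarseClock(long true_ms)
{
    return (true_ms / TICK_MS) * TICK_MS;
}

/* Time measured for a function which really starts at start_ms and
   runs for duration_ms, when both readings come from the coarse clock. */
long MeasuredMs(long start_ms, long duration_ms)
{
    return CoarseClock(start_ms + duration_ms) - CoarseClock(start_ms);
}
```

A function which really takes 5 ms is measured as 0 ms when it starts at
0 ms, but as 32 ms when it starts at 30 ms, because a tick falls inside
its run.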

For all those reasons, ignore the time values for the time being. If you know of
a way to find out what I need, please give me a hint!

REFERENCES

The idea for the analysis part is derived from SUN's UNIX, which
contains a lot of development tools. One of them is "perfmon", which produces
the same analysis (except that it CAN give precise time consumption figures).

GETTING STARTED

All code was compiled and tested with MSC 6.0.

To run a test case, type
   nmake result.txt
   e result.txt

